PREFACE


The work described in this publication was performed by the
Institute For Decision Sciences (IDS) under contract to the Jet
Propulsion Laboratory, an operating division of the California
Institute of Technology. This activity is sponsored by the Jet
Propulsion Laboratory under contract NAS7-918, RE182, A187 with the
National Aeronautics and Space Administration, for the United States
Army Intelligence Center and School.

This specific work was performed in accordance with the FY-87
statement of work (SOW #2).
SUMMARY

The fix estimators MARC (Mathematical Analysis Research Corporation) has
examined in fielded systems have a property which implies that these fix
estimators approach optimality. Explaining the meaning of this statement and
the qualifications that go with it is the purpose of the main body of this
report. The proofs needed are included in a Math Appendix.

A list of topics covered in the individual sections of this report follows:

I. OPTIMAL ESTIMATORS IN STATISTICS

A. Optimality has no unique definition
B. Bias
C. Variance
D. Uniformly Minimum Variance Unbiased Estimators (UMVUEs) as optimal.
E. Cramer Rao Lower Bound as a means of finding UMVUEs.
F. Differences between the fixing case and the standard case:
   In the fixing case,
   1. no unbiased estimators exist
   2. bearing measurements are not identically distributed
   3. estimating location involves determining more than one parameter.

II. FIX MODELING ASSUMPTIONS

A. Independent Normally distributed bearing error with common
   standard deviation, $\sigma$.
B. Minimization of Squared Angular Error
C. Small errors and linearization
D. Discussion of weighting as means of linearizing optimally
E. Ellipse dependence on linearization
F. Optimality being relative to the linearized model

III. PROBLEMS WITH FIX MODELING ASSUMPTIONS

A. Bias resulting from the non-linear terms
B. Bearing selection interfering with the fix modeling assumptions
C. Possibility of dependent bearings
D. If $\sigma$ is known, then better models are possible

IV. OPTIMALITY TO WITHIN TERMS OF ORDER $\sigma^2$

A. Bias is of order $\sigma^2$
B. Cramer Rao Lower Bound result can be generalized to order $\sigma^2$
C. Which methods are order $\sigma^2$ optimal?
D. Choosing the best order $\sigma^2$ optimal method without considering data
   storage and computation time.
E. How much does choosing the best order $\sigma^2$ optimal method matter?
I. OPTIMAL ESTIMATORS IN STATISTICS

A. Optimality has no unique definition

The estimator which is 'best' in any particular situation depends on
the  nature  of: 

1)  The  information  available  to  be  used  by  the  estimator. 

2)  The  overall  reward/penalty  function  which  applies,  considering  the 
variety  of  behaviors  the  estimator  may  exhibit  and  the  applications 
to  be  made  of  it. 

Analysis  can  only  approximate  these  two  considerations.  What  statistics 
does  do  is  identify  measures  of  desirable  behavior  and  the  means  of 
optimizing  these  measures  given  idealized  input. 

B.  Bias 

Estimators  produce  different  results  depending  on  the  nature  of  the 
errors  in  the  observations.  The  size  of  the  error  in  the  fix  depends  upon 
chance.  In  practice,  only  one  fix  is  made.  It  is  useful  however  to  imagine 
averaging  all  of  the  fixes  that  chance  might  have  produced.  If  these 
potential  fixes  are  weighted  by  the  probability  of  their  occurrence,  then 
one  has  the  'expected  fix'.  If  the  'expected  fix'  turns  out  to  be  located  at 
the  true  emitter,  then  the  estimator  would  be  referred  to  as  unbiased. 

When  the  'expected  fix'  turns  out  to  be  located  away  from  the  true  emitter, 
exactly  how  far  apart  these  two  points  are  becomes  an  issue  of  concern. 

In  this  case,  the  estimator  is  referred  to  as  biased  and  the  distance 
between  the  points  is  called  the  bias. 

Estimators used in practice are biased. For the most commonly used
estimator, the 'expected fix' is short in range of the true emitter.
However, the bias (distance short) is usually small (small in comparison
with the scatter measured by the variance).

C.  Variance 

While bias is the distance between the 'expected fix' and the true
emitter, variance is a means of measuring the average squared distance
between the 'expected fix' and individual estimates. If the variance of an
estimator is large, any confidence in the accuracy of the estimate is
limited. A small variance indicates an estimator which will consistently
give similar estimates. However, variance does not include any measure
of the distance between the 'expected fix' and the true emitter. It is quite
possible to have an estimator with small variance which estimates the
location to be a significant distance from the true position of the emitter
owing to bias.
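
The distinction between bias and variance can be illustrated with a small simulation. The following Python sketch is not taken from any fielded system. It assumes a hypothetical two-sensor geometry, bearings measured as angles from the x-axis in radians, and a simple line-intersection fix. The names `intersect`, `simulate_fixes` and `bias_and_variance` are illustrative only.

```python
import math
import random


def intersect(s1, b1, s2, b2):
    """Intersect two lines of bearing (angles in radians from the x-axis)."""
    d1 = (math.cos(b1), math.sin(b1))
    d2 = (math.cos(b2), math.sin(b2))
    rx, ry = s2[0] - s1[0], s2[1] - s1[1]
    # Solve s1 + t1*d1 = s2 + t2*d2 for t1 (assumes the lines are not parallel).
    det = -d1[0] * d2[1] + d2[0] * d1[1]
    t1 = (-rx * d2[1] + d2[0] * ry) / det
    return (s1[0] + t1 * d1[0], s1[1] + t1 * d1[1])


def simulate_fixes(sensors, emitter, sigma, trials, seed=0):
    """Draw independent Normal bearing errors and return the resulting fixes."""
    rng = random.Random(seed)
    true_bearings = [math.atan2(emitter[1] - sy, emitter[0] - sx)
                     for sx, sy in sensors]
    fixes = []
    for _ in range(trials):
        noisy = [b + rng.gauss(0.0, sigma) for b in true_bearings]
        fixes.append(intersect(sensors[0], noisy[0], sensors[1], noisy[1]))
    return fixes


def bias_and_variance(fixes, emitter):
    """Return the average fix, its distance from the emitter, and the scatter."""
    n = len(fixes)
    mx = sum(f[0] for f in fixes) / n
    my = sum(f[1] for f in fixes) / n
    bias = math.hypot(mx - emitter[0], my - emitter[1])
    # Average squared distance between individual fixes and the average fix.
    variance = sum((f[0] - mx) ** 2 + (f[1] - my) ** 2 for f in fixes) / n
    return (mx, my), bias, variance
```

With a finite number of trials the average fix only approximates the 'expected fix', so the computed bias includes sampling noise in addition to the true bias of the intersection.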


D. Uniformly Minimum Variance Unbiased Estimators (UMVUEs) as optimal

As was mentioned above, there exists no unique definition of what
constitutes an optimal estimator. In part, when defining an optimal
estimator, one must consider the constraints of available information.
Estimates of both bias and variance fall within these constraints, but
minimizing only bias without considering variance (or vice-versa)
allows no control over the size of the other. The optimal method, then,
with regard to both bias and variance, is to minimize both. Standard
practice has been to minimize the variance of an unbiased estimator, if
such an estimator can be found. If such an estimator exists, then it is
known as the uniformly minimum variance unbiased estimator (UMVUE).

E. Cramer Rao Lower Bound as a means of finding UMVUEs

The Cramer Rao Inequality provides a lower bound for the variance
of an unbiased estimator. Thus, in attempting to determine which is the
'best' unbiased estimator among the set of unbiased estimators, the Cramer
Rao Lower Bound can be utilized as a measure. Specifically, if the
variance of an unbiased estimator attains the Cramer Rao Lower Bound,
then that estimator can be said to have the minimum variance among
unbiased estimators (in other words, it is a UMVUE). Such an estimator
would then be optimal with regard to bias and variance.
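
As an illustration of the standard case (not the fixing case), the sample mean of n independent Normal observations is unbiased and its variance, $\sigma^2/n$, equals the Cramer Rao Lower Bound. The sketch below checks this numerically. The function names and the Monte Carlo setup are assumptions made for illustration.

```python
import random


def cramer_rao_bound_for_mean(sigma, n):
    """Lower bound on the variance of an unbiased estimator of the mean
    from n independent Normal(mu, sigma^2) observations."""
    return sigma ** 2 / n


def simulated_variance_of_sample_mean(mu, sigma, n, trials, seed=0):
    """Estimate the variance of the sample mean by repeated simulation."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        means.append(sum(rng.gauss(mu, sigma) for _ in range(n)) / n)
    grand = sum(means) / trials
    return sum((m - grand) ** 2 for m in means) / (trials - 1)
```

The simulated variance should agree with the bound to within sampling error, which is what it means for the sample mean to attain the bound.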

F. Differences between the fixing case and the standard case

In our case,

1) There are no existing unbiased fix estimators. For the two LOB case,
   it can be shown that when $\sigma^2$ is unknown, it is not possible for
   an estimator to be unbiased. For three or more LOBs, however, it has not
   yet been resolved whether an unbiased estimator can exist. Even so, the
   authors of this report are unaware of any existing unbiased estimators.

2) Bearing measurements are not identically distributed because they all
   have different means (expected values).

3) Usually, finding UMVUEs requires estimating only one parameter. Yet
   estimating location requires determining both an 'x' and a 'y'
   parameter.
II. FIX MODELING ASSUMPTIONS


Optimality  is  measured  against  models  and  hence  one  must  be  careful 
about  modeling  assumptions  when  discussing  optimality. 

A.  Bearing  error  distribution 
The  assumptions  are: 

1)  Independence  -  This  means  that  the  size  and  direction  of  a  particular 
bearing  error  does  not  influence  the  size  and  direction  of  another 
bearing  error.  An  example  of  where  this  assumption  loses  validity  is 
when  error  is  induced  by  inaccurate  determination  of  True  North. 

2)  Normally  distributed  error  with  mean  zero  -  Normality  is  not 
absolutely  required.  Symmetric  error  with  mean  zero  is  the  most 
important  property  of  the  normal  distribution  that  is  needed. 

3) Common standard deviation - Knowledge about differences in the
standard deviations of different bearings should be used if available.
Unknown differences can be tolerated so long as the differences are
not too severe. Different signal to noise ratios at different sensors are
an example of what might signify a difference in angular error
standard deviation.

B.  Minimization  of  Squared  Angular  Error 

This  model  assumes  that  the  error  is  in  the  bearing  angle  as  opposed 
to  the  sensor  location.  Furthermore,  in  order  to  find  the  best  fix,  a  score 
is  computed  by  taking  the  angular  error  and  squaring  it.  Thus,  a  bearing 
error  three  times  as  large  would  be  nine  times  as  bad.  This  is  one  of  the 
consequences  of  assuming  normality.  One  'wild'  bearing  (from  another 
emitter  perhaps)  can  have  a  very  large  impact  because  of  this 
assumption.  If  a  higher  power  were  used,  then  outliers  would  determine 
the  fix  estimate  completely. 
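
The scoring rule described above can be sketched as follows. This is an illustrative Python fragment, not the code of any fielded system. It assumes bearings are angles from the x-axis in radians, and the names `wrap_angle` and `squared_angular_error` are hypothetical.

```python
import math


def wrap_angle(a):
    """Map an angle difference into the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def squared_angular_error(point, sensors, bearings):
    """Sum of squared differences between measured bearings and the
    bearings that the candidate fix 'point' would have produced."""
    total = 0.0
    for (sx, sy), measured in zip(sensors, bearings):
        predicted = math.atan2(point[1] - sy, point[0] - sx)
        total += wrap_angle(measured - predicted) ** 2
    return total
```

The best fix under this model is the point that minimizes the score. Because each term is squared, a bearing error three times as large contributes nine times as much to the score.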

C.  Small  errors  and  linearization 

Fix algorithms take advantage of the fact that angular error is small.
Linearization of angular error into spatial error is relatively accurate
because angular error is small. If this were not true, then:

1) Fixes would be very biased.

2) The 'Weighted Perpendicular' method (the name used in MARC
reports for the most frequently used method of approximating
minimization of angular error) would not be a good approximation to the
Minimization of Squared Angular Error Method, which it is intended to
approximate.

D.  Weighting  to  attain  optimal  linearization 

The 'Weighted Perpendicular' method iteratively reweights data on
the basis of the latest estimate of the fix location. The limitations on
reweightings of updated fixes are one of the major differences between
algorithms (these limitations are caused by the memory and speed
limitations of the computer). The objective of the reweighting is to obtain
a solution in terms of the best linearization of the angular error that can
be found.

Not all algorithms are optimal in this reweighting sense because of
memory and speed limitations.
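
One way to picture iterative reweighting is sketched below. This is only a plausible reading of the method, since this report does not give the algorithm. It assumes the perpendicular distance from the candidate fix to each line of bearing is divided by the range from the sensor to the latest fix estimate (a small-angle approximation of angular error), with all bearings retained between passes. The function name and the number of passes are hypothetical.

```python
import math


def weighted_perpendicular_fix(sensors, bearings, iterations=5):
    """Iteratively reweighted least squares on perpendicular distances.

    Bearings are angles in radians from the x-axis. The first pass uses
    unit weights; later passes weight each bearing by 1 / range^2 computed
    from the latest fix, so squared distance approximates squared angle.
    """
    weights = [1.0] * len(sensors)
    fix = None
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (sx, sy), bearing, w in zip(sensors, bearings, weights):
            # Unit normal to the line of bearing through the sensor.
            nx, ny = -math.sin(bearing), math.cos(bearing)
            c = nx * sx + ny * sy
            a11 += w * nx * nx
            a12 += w * nx * ny
            a22 += w * ny * ny
            b1 += w * nx * c
            b2 += w * ny * c
        # Solve the 2x2 normal equations (assumes bearings are not all parallel).
        det = a11 * a22 - a12 * a12
        x = (a22 * b1 - a12 * b2) / det
        y = (a11 * b2 - a12 * b1) / det
        fix = (x, y)
        weights = [1.0 / ((x - sx) ** 2 + (y - sy) ** 2) for sx, sy in sensors]
    return fix
```

Setting `iterations=1` in this sketch gives an unweighted perpendicular solution. A system that cannot store all bearings would have to limit the reweighting, which is the limitation discussed above.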

E.  Ellipse  dependence  on  linearization 

The ellipse comes from a quadratic form. That quadratic form is
based on the covariance of a linearization (in the sense of a Taylor Series
to the first derivative). If angular errors were not small, then the correct
family of curves to use to indicate error regions would be much more
difficult to manipulate.
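
The step from a covariance matrix to an error ellipse can be sketched as follows. The fragment assumes a 2-dimensional fix whose errors are approximately Normal with a known covariance, so that the quadratic form follows a chi-square distribution with two degrees of freedom. The function name is hypothetical.

```python
import math


def error_ellipse(cov, confidence=0.95):
    """Return (semi-major axis, semi-minor axis, orientation in radians)
    of the confidence ellipse for a symmetric 2x2 covariance matrix."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam_max = mean + radius
    lam_min = max(mean - radius, 0.0)  # guard against rounding below zero
    # Chi-square quantile with two degrees of freedom.
    scale = -2.0 * math.log(1.0 - confidence)
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return math.sqrt(scale * lam_max), math.sqrt(scale * lam_min), angle
```

The axes are proportional to the square roots of the eigenvalues of the covariance matrix, which is why the eigenvalue comparisons in the Math Appendix translate into statements about ellipse size.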

F. The optimality of this report is relative to the linear approximation

The optimality of concern in this report is qualified. The report will only
assert optimality to within a small second order term. This qualification
should be expected, however, as most of fix estimation theory is based on
the same qualification.
III. PROBLEMS WITH FIX MODELING ASSUMPTIONS

A. Bias resulting from the non-linear terms

After reading the previous section, it should not be surprising that
non-linear terms are one of the major problems with fix modeling
assumptions. Bias is the most significant impact of non-linear terms. It is
the difference between where you would expect a fix to be located on
average and where the emitter really is. When non-linear terms are small,
bias is not noticeable. This is frequently the case. Nonetheless, bias does
exist for all methods except under very special circumstances. The direction
and size of bias can vary from method to method. For the 'Weighted
Perpendicular' method with three or more bearings, bias can be shown to
point short along the range. For other methods, bias can be long.

One  of  the  disadvantages  of  the  'Weighted  Perpendicular'  method  is 
that  bias  does  not  necessarily  get  smaller  as  the  sample  size  increases.  In 
fact,  bias  may  even  increase  with  sample  size.  If  a  large  amount  of  data  is 
going  to  be  used  in  a  fix,  there  are  better  methods.  Unfortunately,  the 
other  methods  require  storing  all  of  this  data. 


B. Bearing selection interfering with the fix modeling assumptions

Bearing selection is not part of the fix model as used by the Army
systems reviewed by MARC. Other models, such as those which add a
uniform background to a truncated normal, may reflect bearing selection
issues, but MARC has not investigated this aspect of the question. It is
clear, however, that bearing selection can interfere with the assumption
of normality. There are many possible repercussions of this loss. For
example, minimizing the squared error may not be optimal.


C. Possibility of dependent bearings

If bearing errors are dependent on one another (for reasons such as
a shared error in determination of True North), then fix errors should be
expected to be larger than normally predicted. If the dependence is
strong enough, then an optimal model would attempt to account for this
dependence.

D. Is $\sigma$ (Angular Error Standard Deviation) known or unknown

The current models are ambiguous with respect to knowing $\sigma^2$. If
$\sigma^2$ is known, then it would be possible to correct for fix algorithm
induced bias. Even a known lower bound would yield a lower bound correction.
The correction would be suspect if $\sigma^2$ is not constant across
bearings, however. The methods of $\sigma^2$ determination that have been
reviewed by the authors of this report raise more questions about underlying
models than they resolve.

IV. OPTIMALITY TO WITHIN TERMS OF ORDER $\sigma^2$

A. Bias is of order $\sigma^2$

Showing that bias must be of at least order $\sigma^2$ (when $\sigma^2$ is
unknown) for particular algorithms is simple. Showing it for all algorithms
in general is less so. For the two bearing case, however, the bias can be
shown to be of order $\sigma^2$ for all algorithms. Correcting for the bias
of the intersection (the two bearing case) depends on knowing $\sigma^2$, and
$\sigma^2$ cannot even be estimated using two bearings. Since $\sigma^2$ can
be estimated with three or more bearings, it seems possible that an order
$\sigma^4$ algorithm could be constructed in these cases. The authors of this
report know of no such algorithms in use. Problems in the estimation of
$\sigma^2$ suggest to the authors of this report that such algorithms would
have limited application in any case.
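
The order $\sigma^2$ behavior of the two bearing intersection can be checked with a small deterministic calculation. The sketch below makes simplifying assumptions that are not part of this report: a hypothetical two-sensor geometry, and a symmetric two-point error distribution (each bearing error is $+\sigma$ or $-\sigma$ with equal probability) used in place of the Normal distribution so that the 'expected fix' can be computed exactly from four cases. The function names are illustrative.

```python
import math
from itertools import product


def intersect(s1, b1, s2, b2):
    """Intersect two lines of bearing (angles in radians from the x-axis)."""
    d1 = (math.cos(b1), math.sin(b1))
    d2 = (math.cos(b2), math.sin(b2))
    rx, ry = s2[0] - s1[0], s2[1] - s1[1]
    det = -d1[0] * d2[1] + d2[0] * d1[1]  # assumes non-parallel bearings
    t1 = (-rx * d2[1] + d2[0] * ry) / det
    return (s1[0] + t1 * d1[0], s1[1] + t1 * d1[1])


def two_point_bias(sensors, emitter, sigma):
    """Distance from the emitter to the average of the four fixes obtained
    with bearing errors of +sigma or -sigma on each of two bearings."""
    true_bearings = [math.atan2(emitter[1] - sy, emitter[0] - sx)
                     for sx, sy in sensors]
    mean_x = mean_y = 0.0
    for e1, e2 in product((-sigma, sigma), repeat=2):
        x, y = intersect(sensors[0], true_bearings[0] + e1,
                         sensors[1], true_bearings[1] + e2)
        mean_x += x / 4.0
        mean_y += y / 4.0
    return math.hypot(mean_x - emitter[0], mean_y - emitter[1])
```

Doubling the error size in this sketch multiplies the bias by roughly four, which is the behavior expected of a term of order $\sigma^2$.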

Since bias of order $\sigma^2$ exists in most, if not all, fielded systems,
it seems reasonable to ask whether fix algorithms are optimal to within a
term of order $\sigma^2$. That is the purpose of this report. Finding the
method with the smallest order $\sigma^2$ term is a reasonable follow-up
question. This question has already been addressed in part for a broad class
of fix algorithms in MARC's Two Dimensional Uncorrelated Bias in Fix
Algorithms (12 March 87).

B. Cramer Rao Lower Bound can be generalized

The Cramer Rao Lower Bound is a theorem which gives a bound on
how accurate an unbiased estimator can be (where accuracy is measured
by the size of the variance). Since there are no known unbiased fix
estimators, a generalization of the Cramer Rao Lower Bound is needed. Such a
generalization is derived in the Math Appendix. The differences between
our version of the Cramer Rao Lower Bound and versions found in Wilks'
Mathematical Statistics are:

1) The result in this report is for order $\sigma^2$ biased estimators.

2) The result in this report is for 2-dimensional estimators instead
of one-dimensional estimators. (A partial result for n-dimensional
estimators is also in this report.)

C. Which methods are order $\sigma^2$ optimal?

All fielded systems that MARC has reviewed are order $\sigma^2$ optimal. The
only theoretical method which MARC has examined which is not optimal
to within order $\sigma^2$ is the unweighted version of the Perpendicular
Method. The variances of all other methods are essentially the same.
D. Choosing the best order $\sigma^2$ optimal method without considering
data storage and computation time

There is a difference in the order $\sigma^2$ term of various methods. The
impact of this order $\sigma^2$ term on bias is much more important than its
impact on the variance. Minimization of Squared Angular Error and
methods similar to it have a smaller order $\sigma^2$ bias than the more
commonly used Weighted Perpendicular Method.

Both  the  Minimization  of  Squared  Angular  Error  and  Weighted 
Perpendicular  Methods  can  only  be  used  in  their  intended  form  as  long  as 
bearings  are  saved.  After  eliminating  bearings,  both  methods  must  give 
way  to  the  suboptimal  version  of  the  Weighted  Perpendicular.  Therefore, 
the  only  justifications  for  use  of  the  Weighted  Perpendicular  are  greater 
simplicity  in  the  coding  of  the  algorithm  and  simplicity  of  computation. 
The  Weighted  Perpendicular  method  also  generalizes  easily  to  a  3- 
dimensional  method. 

E. How significant is the algorithm induced bias?

It  depends  on: 

1)  Sensor  accuracy  (With  very  accurate  bearings,  the  bias  is 
insignificant). 

2)  The  number  of  bearings  kept  before  using  the  suboptimal 
approach.  With  two  bearings,  there  is  no  difference.  With 
four  or  more  bearings,  the  difference  is  of  more  interest. 

3) The range of compass headings over which bearings are
taken (excluding bearings taken from much further away
than the nearest sensors to the emitter). If the range is too
small, then bias can be very significant.

Furthermore, if error originates from sources other than bearing
accuracy, such as sensor location error, then both Weighted Perpendicular
and Minimization of Squared Angular Error are less than optimal. It is this
last consideration that has led the authors of this report to conclude that
changing from Weighted Perpendicular to Minimization of Squared
Angular Error need only be of concern where location estimates can be
shown to have been short in range on average (this is the form of bias
which the Weighted Perpendicular exhibits). Even in this case, there may
be other causes besides the difference between these two algorithms.
MATH APPENDIX


DEFINITIONS:

Let $\theta_i$ = ith piece of observed data (bearing from sensor i
in our application).

Let $(\alpha_1, \ldots, \alpha_n)$ be the true location in n-space (of the
emitter in our application).

Let $(X_1, X_2, \ldots, X_n) = (X_1(\theta_1, \theta_2, \ldots, \theta_m), \ldots, X_n(\theta_1, \theta_2, \ldots, \theta_m))$.

Let $(X_1, \ldots, X_n)$ be estimators of $(\alpha_1, \ldots, \alpha_n)$ such that

$$E\{(X_1, \ldots, X_n)\} = (\alpha_1, \ldots, \alpha_n) + (O(\sigma^2), \ldots, O(\sigma^2))$$

where $O(h(\sigma))$ is such that $k = O(h(\sigma))$ means that as $\sigma$
approaches some limit, $k(\sigma)$ is dominated by some positive constant
multiple of $h(\sigma)$. (See Olmsted, John M., Real Variables,
Appleton-Century-Crofts, Inc., 1956, pg. 169)

Let $f(\theta_1, \ldots, \theta_m; (\alpha_1, \ldots, \alpha_n))$ be the
probability density function of the continuous random variable $\theta$.

Let $D$ denote a directional derivative in the direction $(u_1, \ldots, u_n)$.

Let $L = -\log(f)$ with $(\alpha_1, \ldots, \alpha_n)$ evaluated at
$(X_1, \ldots, X_n)$ as the defining equation for the $(X_1, \ldots, X_n)$
vector (i.e., minimize $L$ or determine the critical point(s) of $L$ with
respect to $(X_1, \ldots, X_n)$).

Let $S = D\log(f)$.

Let $COV = \mathrm{Cov}(X_1, \ldots, X_n)$.

Let $MSE_{ij} = E\{(X_i - \alpha_i)(X_j - \alpha_j)\}$ = Mean Squared Error.

The above definitions are utilized in seven important and interrelated
results.


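RESULT #1

$$COV = MSE + O(\sigma^4)$$

Proof:

$$
\begin{aligned}
COV_{ij} &= E\{(X_i - E[X_i])(X_j - E[X_j])\} \\
&= E\{(X_i - \alpha_i - E[X_i - \alpha_i])(X_j - \alpha_j - E[X_j - \alpha_j])\} \\
&= E\{(X_i - \alpha_i)(X_j - \alpha_j)\} - E\{(X_i - \alpha_i)E[X_j - \alpha_j]\} - E\{E[X_i - \alpha_i](X_j - \alpha_j)\} + E[X_i - \alpha_i]E[X_j - \alpha_j] \\
&= MSE_{ij} - E[X_i - \alpha_i]E[X_j - \alpha_j] \\
&= MSE_{ij} + O(\sigma^4)
\end{aligned}
$$

The last step holds because each bias term $E[X_i - \alpha_i]$ is $O(\sigma^2)$.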
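
The identity behind Result #1, that the covariance equals the mean squared error minus the product of the biases, also holds exactly for sample moments and can be checked numerically. The following sketch uses a hypothetical helper name and arbitrary illustrative data.

```python
import random


def cov_mse_bias(samples, truth):
    """Return the sample covariance matrix, the mean squared error matrix
    about 'truth', and the bias vector (all using 1/n moments)."""
    n = len(samples)
    dim = len(truth)
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    bias = [mean[k] - truth[k] for k in range(dim)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
            for j in range(dim)] for i in range(dim)]
    mse = [[sum((s[i] - truth[i]) * (s[j] - truth[j]) for s in samples) / n
            for j in range(dim)] for i in range(dim)]
    return cov, mse, bias


# Illustrative data: estimates scattered around a slightly biased mean.
rng = random.Random(1)
truth = (1.0, 2.0)
samples = [(1.01 + 0.1 * rng.gauss(0.0, 1.0), 1.98 + 0.2 * rng.gauss(0.0, 1.0))
           for _ in range(1000)]
cov, mse, bias = cov_mse_bias(samples, truth)
```

When each bias component is of order $\sigma^2$, the product of two bias components is of order $\sigma^4$, which is the correction term in Result #1.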

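RESULT #2

The directional variance satisfies, for a unit vector $(u_1, \ldots, u_n)$,

$$(u_1, \ldots, u_n)\, MSE\, (u_1, \ldots, u_n)^T \ge \frac{1 + O(\sigma^2)}{E\{S^2\}}$$

Equality holds if

$(u_1, \ldots, u_n) \cdot (X_1 - \alpha_1, X_2 - \alpha_2, \ldots, X_n - \alpha_n)$ is proportional to $S$.

Proof:

$$
\begin{aligned}
(u_1, \ldots, u_n) &= D(\alpha_1, \ldots, \alpha_n) \\
&= D \int \cdots \int (x_1, \ldots, x_n)\, f + D(O(\sigma^2), \ldots, O(\sigma^2)) \\
&= D \int \cdots \int (x_1, \ldots, x_n)\, f - (\alpha_1, \ldots, \alpha_n)\, D \int \cdots \int f + (O(\sigma^2), \ldots, O(\sigma^2)) \\
&= \int \cdots \int (x_1, \ldots, x_n)\, D[f] - (\alpha_1, \ldots, \alpha_n) \int \cdots \int D[f] + (O(\sigma^2), \ldots, O(\sigma^2)) \\
&= \int \cdots \int [(x_1, \ldots, x_n) - (\alpha_1, \ldots, \alpha_n)]\, D[f] + (O(\sigma^2), \ldots, O(\sigma^2)) \\
&= E\{[(X_1, \ldots, X_n) - (\alpha_1, \ldots, \alpha_n)]\, S\} + (O(\sigma^2), \ldots, O(\sigma^2))
\end{aligned}
$$

Taking the dot product on both sides with $(u_1, \ldots, u_n)$ and squaring,

$$|(u_1, \ldots, u_n) \cdot (u_1, \ldots, u_n) + O(\sigma^2)|^2 = [E\{[(u_1, \ldots, u_n) \cdot (X_1 - \alpha_1, \ldots, X_n - \alpha_n)]\, S\}]^2$$

Now, by the Cauchy-Schwarz inequality,

$$
\begin{aligned}
|(u_1, \ldots, u_n) \cdot (u_1, \ldots, u_n) + O(\sigma^2)|^2 &\le E\{[(u_1, \ldots, u_n) \cdot (X_1 - \alpha_1, \ldots, X_n - \alpha_n)]^2\}\, E\{S^2\} \\
&= (u_1, \ldots, u_n)\, MSE\, (u_1, \ldots, u_n)^T\, E\{S^2\}
\end{aligned}
$$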

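DEFINITIONS AND ASSUMPTIONS APPLYING TO THE NORMAL CASE:

Assume the $\theta_i$ are independently Normally distributed with mean
$\theta_i(\alpha_1, \ldots, \alpha_n)$ and standard deviation $\sigma$
(variance $\sigma^2$).

Further, assume that the function associated with the mean above may be
computed at the estimate, i.e. $\theta_i(X_1, \ldots, X_n)$.

Finally, assume that $X_i$ evaluated at $(\alpha_1, \ldots, \alpha_n)$ is
$\alpha_i$ (i.e., the true parameter is computed if there is no
'observation error' and the mean observation is the no error case).

Let $\varepsilon_i = \theta_i - \theta_i(\alpha_1, \ldots, \alpha_n)$. In some
cases, this may be interpreted as the 'observation error'.

Let $g_i(\varepsilon_1, \ldots, \varepsilon_m) = X_i(\theta_1, \theta_2, \ldots, \theta_m) = X_i(\varepsilon_1 + \theta_1(\alpha_1, \ldots, \alpha_n), \ldots, \varepsilon_m + \theta_m(\alpha_1, \ldots, \alpha_n))$.

$g_{i,\varepsilon_k}$ denotes the partial derivative of $g_i$ with respect to
$\varepsilon_k$ evaluated at $\varepsilon_i = 0$ for all i.

$\theta_{i,\alpha_j}$ denotes the partial derivative of $\theta_i$ with
respect to $\alpha_j$.

$MSE_L$ represents the Mean Squared Error matrix for the Least Squares
approach (i.e., minimizing the sum of squared errors).

TAYLOR SERIES EXPANSION

Expanding

$$\{g_1(\varepsilon_1, \ldots, \varepsilon_m), \ldots, g_n(\varepsilon_1, \ldots, \varepsilon_m)\} = (\alpha_1, \ldots, \alpha_n) + \sum_i [g_{1,\varepsilon_i}, \ldots, g_{n,\varepsilon_i}]\, \varepsilon_i + \sum_i \sum_j O(\varepsilon_i \varepsilon_j)$$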


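RESULT #3

Let $\theta_i$ have an independent Normal distribution with mean
$\theta_i(\alpha_1, \ldots, \alpha_n)$ and variance $\sigma^2$. Then

$$E\{S^2\} = \frac{1}{\sigma^2} \sum_i (u_1, \ldots, u_n) \begin{pmatrix} (\theta_{i,\alpha_1})^2 & \cdots & \theta_{i,\alpha_1}\theta_{i,\alpha_n} \\ \vdots & & \vdots \\ \theta_{i,\alpha_n}\theta_{i,\alpha_1} & \cdots & (\theta_{i,\alpha_n})^2 \end{pmatrix} (u_1, \ldots, u_n)^T$$

Proof:

$$
\begin{aligned}
S &= \sum_i \left\{ \frac{\theta_i - \theta_i(\alpha_1, \ldots, \alpha_n)}{\sigma^2}\, \theta_{i,\alpha_1} u_1 + \cdots + \frac{\theta_i - \theta_i(\alpha_1, \ldots, \alpha_n)}{\sigma^2}\, \theta_{i,\alpha_n} u_n \right\} \\
&= \sum_i \left\{ \frac{\theta_i - \theta_i(\alpha_1, \ldots, \alpha_n)}{\sigma^2}\, (\theta_{i,\alpha_1} u_1 + \cdots + \theta_{i,\alpha_n} u_n) \right\}
\end{aligned}
$$

Therefore,

$$
\begin{aligned}
E\{S^2\} &= E\left\{ \sum_i \sum_j \frac{\varepsilon_i}{\sigma^2} \frac{\varepsilon_j}{\sigma^2} [\theta_{i,\alpha_1} u_1 + \cdots + \theta_{i,\alpha_n} u_n][\theta_{j,\alpha_1} u_1 + \cdots + \theta_{j,\alpha_n} u_n] \right\} \\
&= \frac{1}{\sigma^4} \sum_i \sum_j E(\varepsilon_i \varepsilon_j)(\theta_{i,\alpha_1} u_1 + \cdots + \theta_{i,\alpha_n} u_n)(\theta_{j,\alpha_1} u_1 + \cdots + \theta_{j,\alpha_n} u_n) \\
&= \frac{1}{\sigma^2} \left[ \sum_i (\theta_{i,\alpha_1} u_1 + \cdots + \theta_{i,\alpha_n} u_n)(\theta_{i,\alpha_1} u_1 + \cdots + \theta_{i,\alpha_n} u_n) \right]
\end{aligned}
$$

The last step uses independence: $E(\varepsilon_i \varepsilon_j) = \sigma^2$ when $i = j$ and 0 otherwise.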

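RESULT #4

Let $\theta_i$ have an independent Normal distribution with mean
$\theta_i(\alpha_1, \ldots, \alpha_n)$ and variance $\sigma^2$. Then

$$MSE_L = \sigma^2 \left[ \sum_i \begin{pmatrix} (\theta_{i,\alpha_1})^2 & \cdots & \theta_{i,\alpha_1}\theta_{i,\alpha_n} \\ \vdots & & \vdots \\ \theta_{i,\alpha_n}\theta_{i,\alpha_1} & \cdots & (\theta_{i,\alpha_n})^2 \end{pmatrix} \right]^{-1} + O(\sigma^4)$$

Proof:

$$L = \sum_t (\theta_t - \theta_t(X_1, \ldots, X_n))^2 / (2\sigma^2) + (m/2)\ln(2\pi\sigma^2)$$

The Mean Squared Error matrix is by definition

$$
\begin{aligned}
MSE &= E\left\{ \begin{pmatrix} (X_1 - \alpha_1)^2 & \cdots & (X_1 - \alpha_1)(X_n - \alpha_n) \\ \vdots & & \vdots \\ (X_1 - \alpha_1)(X_n - \alpha_n) & \cdots & (X_n - \alpha_n)^2 \end{pmatrix} \right\} \\
&= E\left\{ \sum_i \sum_j (g_{1,\varepsilon_i}, \ldots, g_{n,\varepsilon_i})^T (g_{1,\varepsilon_j}, \ldots, g_{n,\varepsilon_j})\, \varepsilon_i \varepsilon_j \right\} + O(\sigma^4) \\
&= \sum_i (g_{1,\varepsilon_i}, \ldots, g_{n,\varepsilon_i})^T (g_{1,\varepsilon_i}, \ldots, g_{n,\varepsilon_i})\, \sigma^2 + O(\sigma^4)
\end{aligned}
$$

Recall that $g_{k,\varepsilon_i}$ is evaluated at the true location.

Recall that $(g_1, \ldots, g_n)$ are the $(X_1, \ldots, X_n)$ determined where
$L_{X_j} = 0$.

Differentiating the defining equations with respect to $\varepsilon_i$:

$$L_{X_j} = \sum_{t=1}^{m} 2\, \theta_{t,X_j} (\theta_t(X_1, \ldots, X_n) - \theta_t) / (2\sigma^2)$$

$$
\begin{aligned}
L_{X_j X_k} &= \sum_{t=1}^{m} 2\, [\theta_{t,X_j} \theta_{t,X_k} + \theta_{t,X_j X_k} (\theta_t(X_1, \ldots, X_n) - \theta_t)] / (2\sigma^2) \\
&= \sum_{t=1}^{m} 2\, \theta_{t,X_j} \theta_{t,X_k} / (2\sigma^2) \quad \text{(recall that } \varepsilon_i = 0\text{)}
\end{aligned}
$$

Since

$$
\begin{aligned}
(2\sigma^2) L_{X_j} &= \sum_{t=1}^{m} 2\, \theta_{t,X_j} [\theta_t(X_1, \ldots, X_n) - \theta_t(\alpha_1, \ldots, \alpha_n) + \theta_t(\alpha_1, \ldots, \alpha_n) - \theta_t] \\
&= \sum_{t=1}^{m} 2\, \theta_{t,X_j} [\theta_t(X_1, \ldots, X_n) - \theta_t(\alpha_1, \ldots, \alpha_n)] - \sum_{t=1}^{m} 2\, \theta_{t,X_j}\, \varepsilon_t
\end{aligned}
$$

it follows that

$$L_{X_j, \varepsilon_i} = -2\, \theta_{i,X_j} / (2\sigma^2)$$

Therefore, differentiating $L_{X_j}(g_1, \ldots, g_n; \varepsilon_1, \ldots, \varepsilon_m) = 0$ with respect to $\varepsilon_i$ gives

$$\sum_k L_{X_j X_k}\, g_{k,\varepsilon_i} + L_{X_j, \varepsilon_i} = 0$$

that is,

$$(g_{1,\varepsilon_i}, \ldots, g_{n,\varepsilon_i})^T = \left[ \sum_t \theta_{t,X_j} \theta_{t,X_k} \right]^{-1} (\theta_{i,X_1}, \ldots, \theta_{i,X_n})^T$$

However, $g$ is evaluated at the true location, where $\theta_{t,X_j} = \theta_{t,\alpha_j}$. Substituting into the expression for MSE produces the desired result

$$MSE_L = \sigma^2 \left[ \sum_i \begin{pmatrix} (\theta_{i,\alpha_1})^2 & \cdots & \theta_{i,\alpha_1}\theta_{i,\alpha_n} \\ \vdots & & \vdots \\ \theta_{i,\alpha_n}\theta_{i,\alpha_1} & \cdots & (\theta_{i,\alpha_n})^2 \end{pmatrix} \right]^{-1} + O(\sigma^4)$$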
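
For bearings in two dimensions, the leading term of Result #4 can be evaluated directly. The sketch below assumes the mean bearing function is $\theta_i(x, y) = \operatorname{atan2}(y - y_i, x - x_i)$ for a sensor at $(x_i, y_i)$, which is an assumption about the bearing convention rather than something stated in this appendix. The function name is hypothetical.

```python
import math


def linearized_mse(sensors, emitter, sigma):
    """Leading term sigma^2 * [sum_i grad_i grad_i^T]^(-1) of the mean
    squared error matrix, where grad_i holds the partial derivatives of
    the i-th mean bearing with respect to the emitter coordinates."""
    a11 = a12 = a22 = 0.0
    for sx, sy in sensors:
        dx, dy = emitter[0] - sx, emitter[1] - sy
        r2 = dx * dx + dy * dy
        gx, gy = -dy / r2, dx / r2  # partials of atan2(dy, dx) in x and y
        a11 += gx * gx
        a12 += gx * gy
        a22 += gy * gy
    det = a11 * a22 - a12 * a12  # nonzero unless all bearings are parallel
    s2 = sigma ** 2
    return [[s2 * a22 / det, -s2 * a12 / det],
            [-s2 * a12 / det, s2 * a11 / det]]
```

For sensors at (-5, 0) and (5, 0), an emitter at (0, 10), and $\sigma$ = 0.02 radians, the matrix is diagonal with entries 0.03125 and 0.125, so the down-range uncertainty is the larger of the two.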


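RESULT #5

$$\vec{u}\, MSE\, \vec{u}^T \ge \frac{1 + O(\sigma^2)}{\vec{u}\, MSE_L^{-1}\, \vec{u}^T} \quad \text{where } \vec{u} = (u_1, u_2, \ldots, u_n)$$

Proof:

From Result #2,

$$(u_1, \ldots, u_n)\, MSE\, (u_1, \ldots, u_n)^T \ge \frac{1 + O(\sigma^2)}{E\{S^2\}}$$

From Result #3,

$$E\{S^2\} = \frac{1}{\sigma^2} \sum_i (u_1, \ldots, u_n) \begin{pmatrix} (\theta_{i,\alpha_1})^2 & \cdots & \theta_{i,\alpha_1}\theta_{i,\alpha_n} \\ \vdots & & \vdots \\ \theta_{i,\alpha_n}\theta_{i,\alpha_1} & \cdots & (\theta_{i,\alpha_n})^2 \end{pmatrix} (u_1, \ldots, u_n)^T$$

From Result #4,

$$MSE_L = \sigma^2 \left[ \sum_i \begin{pmatrix} (\theta_{i,\alpha_1})^2 & \cdots & \theta_{i,\alpha_1}\theta_{i,\alpha_n} \\ \vdots & & \vdots \\ \theta_{i,\alpha_n}\theta_{i,\alpha_1} & \cdots & (\theta_{i,\alpha_n})^2 \end{pmatrix} \right]^{-1} + O(\sigma^4)$$

Substituting Result #4 into Result #3 gives $E\{S^2\} = \vec{u}\, MSE_L^{-1}\, \vec{u}^T\, [1 + O(\sigma^2)]$, and Result #2 then gives the stated inequality.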
RESULT #6

Maximum eigenvalue of $MSE_L$ $\le$ Maximum eigenvalue for other methods

Proof:

Let $\lambda_L$ be the maximum eigenvalue of the L method.

Let $\lambda_{other}$ be the maximum eigenvalue of some other method.

Let $\vec{v}$ be the unit eigenvector associated with $\lambda_L$. Then

$$\vec{v}\, MSE_L^{-1}\, \vec{v}^T = 1/\lambda_L$$

From Result #5, it follows (to within the factor $1 + O(\sigma^2)$) that

$$\lambda_L \le \vec{v}\, MSE_{other}\, \vec{v}^T$$

Finally, since $MSE_{other}$ is a positive definite matrix,
$\vec{v}\, MSE_{other}\, \vec{v}^T$ is between the minimum and maximum
eigenvalues for $MSE_{other}$. Thus,

$$\lambda_L \le \lambda_{other}$$


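RESULT #7

Minimum eigenvalue of $MSE_L$ $\le$ Minimum eigenvalue for other methods

Proof:

Let $\lambda_{other}$ be the minimum eigenvalue of some other method.

Let $\lambda_L$ be the minimum eigenvalue of the L method.

Let $\vec{v}$ be the unit eigenvector associated with $\lambda_{other}$.

Since the maximum eigenvalue of $MSE_L^{-1}$ is $1/\lambda_L$,

$$\frac{1}{\lambda_L} = \max_{\vec{w}}\, \vec{w}\, MSE_L^{-1}\, \vec{w}^T \ge \vec{v}\, MSE_L^{-1}\, \vec{v}^T$$

where the maximum is over unit vectors $\vec{w}$.

Recall Result #5 (to within the factor $1 + O(\sigma^2)$):

$$\frac{1}{\vec{v}\, MSE_L^{-1}\, \vec{v}^T} \le \vec{v}\, MSE_{other}\, \vec{v}^T = \lambda_{other}$$

Therefore,

$$\lambda_L \le \lambda_{other}$$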

Observations  -  Comparing  error  ellipses  between  the  L  and  any  other  estimator 
at  the  same  confidence  level  we  note: 

1)  The  largest  axis  of  the  L  error  ellipse  is  smaller  than  the 
largest  axis  of  the  other  estimator. 

2)  The  smallest  axis  of  the  L  error  ellipse  is  smaller  than  the 

smallest  axis  of  the  other  estimator. 

3)  If  one  is  only  in  two  dimensions,  1)  and  2)  are  enough  to  imply  that 
the  L  error  ellipse  is  smaller  in  size. 


In  n  dimensions,  intermediate  axes  must  also  be  considered  in  any  discussion 
of  error  ellipse  size,  thus  1)  and  2)  are  not  sufficient  to  show  that  the  L 
error  ellipse  is  smaller  in  the  general  case. 


